ABSTRACT

Summary: Gene expression or metabolomics data generated in
clinical settings are often associated with multiple metadata (e.g.
diagnosis, genotype, gender). It is of great interest to analyze and to
visualize the data in these contexts. Here, we introduce INVEX, a
novel web-based tool that integrates server-side capabilities for
data analysis with browser-based technology for data visualization.
INVEX has two key features: (i) flexible differential expression analysis 
for a wide variety of experimental designs; and (ii) interactive visual- 
ization within the context of metadata and biological annotations. 
INVEX has built-in support for gene/metabolite annotation and a fully 
functional heatmap builder. 

Availability and implementation: Freely available at http://www. 
invex.ca. 
1 INTRODUCTION

'Omics' technologies such as microarrays, next-generation 
sequencing and metabolomics are increasingly used in clinical 
studies. In many cases, a single dataset will be associated with 
multiple clinical parameters (metadata) such as diagnosis, geno- 
type, gender and so forth. It is of great interest to analyze and to 
visualize the data within these contexts as well as biological an- 
notations to enable one to capture dynamic changes that correl- 
ate with clinical factors. Linear models have proven to be a 
powerful and flexible approach for analysis of gene expression 
experiments (Smyth, 2005). However, users need a solid
understanding of statistical concepts and the R language to use this
approach properly. Heatmaps, particularly when coupled with
clustering, have become a hallmark of gene expression data presentation.
However, most web-based tools provide only static images with 
limited support for user interactions. Interactive visualization is 
mostly limited to stand-alone tools or Java applet plugins
(Pavlidis and Noble, 2003; Perez-Llamas and Lopez-Bigas,
2011; Reich et al., 2006; Saeed et al., 2003; Saldanha, 2004).
The rapid development of information technology, especially
HTML5 and JavaScript, has presented new opportunities to
overcome this limitation (Miller et al., 2013; Tan et al., 2013).
Here, we present INVEX (Integrative Visualization of Expression
data), an intuitive web-based tool that integrates
server-side capabilities for data analysis and annotation with
browser-based technology for data visualization. INVEX
allows researchers to perform flexible data analysis and to visu- 
ally explore the results as interactive heatmaps within the context 
of associated metadata and biological annotations. 

2 IMPLEMENTATION 

INVEX is composed of two modules. The server-side module 
was implemented using JavaServer Faces 2.0 technology.
The data analysis was based on R and several packages
from Bioconductor (Gentleman et al., 2004). The client-side
module was developed based on the HTML5 canvas and 
JavaScript using the jQuery library (www.jquery.com). INVEX 
has been tested using Google Chrome (5.0+), Firefox (3.0+) and 
Internet Explorer (9.0+) browsers. The performance of visualization
depends on the user's computer. We recommend accessing
INVEX from a computer with at least a 15-inch screen and 2
GB of memory.

3 APPLICATION EXAMPLE 

INVEX provides four example datasets, each associated with
multiple metadata: three gene expression datasets (Estrogen,
Sepsis and TimeSeries) and one metabolomics dataset (Cachexia).
Here, we illustrate the main features of INVEX using the Sepsis 
dataset. 

3.1 Data upload, annotation and analysis 

INVEX accepts an expression data table annotated with various 
metadata. The data can be uploaded as a tab-delimited text (.txt) 
or in its compressed format (.zip) (see INVEX 'Help' page for 
detailed instructions). Click the 'Start' menu on the home page to 
enter the analysis page. To access our test datasets, click the 
'Try Examples' button on the bottom left. The four datasets
are listed with detailed descriptions. The Sepsis data were gener- 
ated from a study comparing gene expression changes 
from lipopolysaccharide (LPS)-induced inflammation to endotoxin
tolerance in human peripheral blood mononuclear cells
(PBMC) from four donors (Pena et al., 2011). There are three
experimental conditions: control, LPS (pro-inflammatory) and 
LPS_LPS indicating two doses of LPS treatments within a day 
(leading to endotoxin tolerance). Thus, there are two types of 
metadata: Treatment and Donor. Select the Sepsis data and click 
'Yes' to upload the file. For the convenience of testing, INVEX 
sets the default parameters for the remaining steps — annotation, 
normalization, differential expression and enrichment analysis. 
Click 'Submit' to proceed through the first two steps. For differential
analysis, INVEX supports two-group and multiple-group, paired or
block, time-series, common-control and nested-comparison
designs. Here, it is particularly interesting to identify genes that
respond differently between the two treatments. To perform such
analyses, we choose 'nested comparisons' between 'control vs. 
LPS' and 'control vs. LPS_LPS', and select 'Interactions only'. 
The analysis returns 1791 significant genes, which is reduced to
251 after setting the log2 fold change cutoff to 1.0 (a 2-fold change in
expression). Use 'Kyoto Encyclopedia of Genes and Genomes 
(KEGG)' for enrichment analysis. Finally, click 'Proceed to visu- 
alization' to enter the visualization page. 
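
The nested comparison and fold change cutoff described above can be illustrated with a short sketch. It assumes that 'Interactions only' tests the difference between the two control-based contrasts, that is (LPS_LPS - control) - (LPS - control) on the log2 scale, and that the cutoff is applied to the absolute value of that difference; the text does not spell out either detail. The gene names and expression values are invented, donor pairing and the significance test are omitted, and this is not the R/Bioconductor code that INVEX runs.

```python
from statistics import mean

# Hypothetical log2 expression values for two genes, four donors per group.
# Names and numbers are made up purely for illustration.
expression = {
    "GENE_A": {
        "control": [5.0, 5.2, 4.9, 5.1],
        "LPS": [8.1, 8.0, 7.9, 8.2],
        "LPS_LPS": [5.6, 5.4, 5.5, 5.7],
    },
    "GENE_B": {
        "control": [6.0, 6.1, 5.9, 6.0],
        "LPS": [6.5, 6.4, 6.6, 6.5],
        "LPS_LPS": [6.3, 6.2, 6.4, 6.3],
    },
}


def interaction_log2fc(groups):
    """Assumed interaction contrast: (LPS_LPS - control) - (LPS - control)."""
    lps_effect = mean(groups["LPS"]) - mean(groups["control"])
    lps_lps_effect = mean(groups["LPS_LPS"]) - mean(groups["control"])
    return lps_lps_effect - lps_effect


def passes_cutoff(log2fc, cutoff=1.0):
    """A log2 cutoff of 1.0 corresponds to a 2-fold change (2 ** 1.0 == 2)."""
    return abs(log2fc) >= cutoff


# Keep genes whose response differs between the two treatments by >= 2-fold.
selected = [
    gene
    for gene, groups in expression.items()
    if passes_cutoff(interaction_log2fc(groups))
]
print(selected)
```

In this toy input, GENE_A responds strongly to a single LPS dose but much less to the repeated dose, so it is kept; GENE_B responds similarly to both treatments and is filtered out.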

3.2 Visual data exploration 

A screenshot of the visualization page is shown in Figure 1. 
There are four views — Overview, Focus view, Annotation view 
and Heatmap builder, with a top toolbar containing menus for 
adjusting resolution, colors, clustering and so forth. The 
Overview on the left provides a bird's-eye view of the expression 
profile of all significant genes (Fig. 1A). By default, genes are 
ordered by their adjusted P-values. Select 'Euclidean distance' to 
cluster genes for pattern discovery. Users can drag to select any 
region of interest for detailed inspection in the Focus view. The 
Focus view in the center shows the gene expression profiles of 
current interest with three adjustable resolutions (Fig. 1B). The
metadata, sample IDs and color keys are displayed on the top 
and bottom panels, respectively. Double click the 'Treatment' 
metadata row to sort all samples accordingly. On the right 
side, the Overall Enriched Themes pane shows enriched path- 
ways, P-values and matched gene numbers (Fig. 1C). Double 
click the name of the top hit 'cytokine-cytokine receptor
interaction' to visualize the expression profiles of its 22 genes.
The Enriched Themes in Current Focus bar allows users to iden- 
tify enriched functional modules for genes in Focus view. 
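
The enrichment P-values shown for each pathway can be understood through a standard over-representation test. The sketch below uses a one-sided hypergeometric test, which is a common choice for this kind of analysis; the text does not state which test INVEX applies, so treat it as an assumption. In the usage example, only the 251 selected genes and the 22 matched genes come from the Sepsis example; the universe size and pathway size are hypothetical placeholders.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, selected_size, hits):
    """Probability of seeing at least `hits` pathway genes among
    `selected_size` genes drawn without replacement from the universe."""
    max_hits = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, selected_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# 251 significant genes, 22 of them in the pathway (from the Sepsis example).
# The universe of 20000 genes and pathway size of 250 are assumed values.
p_value = hypergeom_enrichment_p(
    universe_size=20000, pathway_size=250, selected_size=251, hits=22
)
print(f"enrichment P-value: {p_value:.3g}")
```

A small P-value indicates that the pathway contains more of the selected genes than expected by chance. In practice such values are usually adjusted for the number of pathways tested.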

3.3 Building custom heatmaps 

Click the 'Heatmap builder' from the top toolbar to activate the 
Heatmap builder (Fig. 1D). This pane is a 'playground' that
allows users to easily create custom heatmaps to reveal specific 
features. Users can now double click to select a single gene or 
drag to select multiple genes from Focus view to Heatmap builder. 
Within the builder, users can double click to delete or drag to 
reorganize a row. Separators (blank rows) can be added and then 
dragged to specific positions to create visual cluster boundaries. 
When all genes of interest are included and organized, users can 
edit samples using the 'Edit samples' option. Finally, all heat- 
maps can be exported as portable network graphics (PNG) 
images labeled with metadata, sample IDs and color keys by 
using the 'Download' function from the top toolbar. 

4 CONCLUSIONS 

In the context of integrative analysis of expression data from 
clinical studies, there are two general scenarios: multiple expres- 
sion datasets collected for the same disease or multiple metadata 
collected for a single dataset. We have recently developed 
INMEX, a tool to support data analysis in the former scenario 
(Xia et al., 2013). In this article, we introduce INVEX, which has
been designed for the latter scenario. By coupling conventional
server-side functions for data analysis and annotation with
client-side visualization technologies, INVEX provides a promising
approach for developing efficient bioinformatics tools in
the 'omics' era.